Abstract

Unification algorithms are an important component of mechanical theorem proving systems:
they find most general substitutions which, when applied to two expressions, make them
equivalent. Functions which are associative and commutative (such as the arithmetic addition
and multiplication functions) are often the subject of mechanical theorem proving. An
algorithm which unifies terms whose function is associative and commutative is presented here.
The algorithm eliminates the need for axiomatizing the associativity and commutativity
properties and returns a complete set of unifiers without recourse to the indefinite generation
of variants and instances of the terms being unified required by previous solutions to the
problem.

Introduction


At the core of many theorem proving systems is a unification algorithm which returns
for a pair of input expressions a set of unifying substitutions, assignments to the variables of
the expressions which make the two expressions equivalent. Typical is the unification
algorithm of Robinson [6] for unifying atomic formulas of the first order predicate calculus in
resolution theorem proving [1].

This work treats the case of unifying terms of the first order predicate calculus where
the function is associative and commutative. Such functions are mathematically important and
thus of interest to developers of theorem proving programs. Examples of such functions are
the arithmetic addition and multiplication functions.

The case where the function is simply commutative is handled by a trivial
extension to Robinson's unification algorithm which unifies the arguments of one term against
permutations of the arguments of the other term.
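The permutation-based extension can be sketched as follows. This is a minimal illustration in
Python, not the implementation of any cited system. It assumes that terms are nested tuples
(function symbol, argument, ...), that strings beginning with x, y, or z are variables, and that
only the top-level function is commutative; unify and commutative_unify are hypothetical names.

```python
from itertools import permutations

def is_var(t):
    # Assumption for this sketch: variables are strings starting with x, y, or z.
    return isinstance(t, str) and t[0] in "xyz"

def walk(t, s):
    # Follow variable bindings in the (triangular) binding dictionary s.
    while is_var(t) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    t = walk(t, s)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, u, s) for u in t[1:])

def unify(t1, t2, s):
    """Ordinary unification; returns an extended binding dict, or None on failure."""
    t1, t2 = walk(t1, s), walk(t2, s)
    if t1 == t2:
        return s
    if is_var(t1):
        return None if occurs(t1, t2, s) else {**s, t1: t2}
    if is_var(t2):
        return unify(t2, t1, s)
    if (isinstance(t1, tuple) and isinstance(t2, tuple)
            and t1[0] == t2[0] and len(t1) == len(t2)):
        for u1, u2 in zip(t1[1:], t2[1:]):
            s = unify(u1, u2, s)
            if s is None:
                return None
        return s
    return None

def commutative_unify(t1, t2):
    """Unify the arguments of t1 against every permutation of the arguments of t2."""
    results = []
    if t1[0] != t2[0] or len(t1) != len(t2):
        return results
    for perm in permutations(t2[1:]):
        s = {}
        for u1, u2 in zip(t1[1:], perm):
            s = unify(u1, u2, s)
            if s is None:
                break
        if s is not None and s not in results:
            results.append(s)
    return results

# g is a hypothetical commutative function symbol.
print(commutative_unify(("g", "x", "a"), ("g", "a", "b")))
```

For g(xa) and g(ab) only the permutation that pairs x with b succeeds, so the sketch reports a
single binding set.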


The case where the function is simply associative is difficult and we know of no
general solution. Suggestive of the difficulty of this problem is the fact that there may be an
infinite number of unifiers for a pair of terms. For example, the terms f(xa) and f(ax), where f
is associative, a is a constant, and x is a variable, have unifiers with x←a, x←f(aa), x←f(aaa),
and so on.

(We represent the argument lists of associative functions with no extra parentheses, i.e.,
f(abc) rather than f(af(bc)) or f(f(ab)c).)


Two principal approaches to handling associativity or commutativity are available. The
first, standard approach is to represent the terms conventionally, i.e., f(af(bc)) or f(f(ab)c)
rather than f(abc), and axiomatize the associativity or commutativity property. The
associativity axiom would be f(xf(yz)) = f(f(xy)z), and commutativity would be
f(xy) = f(yx). These axioms could be applied using some equality inference rules such as
paramodulation [5].


The second approach represents associative functions as functions with an arbitrary
number of arguments, i.e., uses f(abc) rather than f(af(bc)) or f(f(ab)c). Special purpose
unification algorithms are provided for terms whose functions are associative, commutative, or
both. Examples of this approach in first order predicate calculus theorem proving are the
work of Nevins [2] and Slagle [8]. The algorithms for associativity, and for associativity and
commutativity, are incomplete, i.e., they fail to return all the unifiers in some cases. An
example of this approach in the area of programming languages for problem solving is the use
of the associative data type tuple or vector and the associative and commutative data type bag
in the QA4 and QLISP languages [7,4]. Again, in this case the algorithms for pattern matching
(unifying) these expressions are incomplete. In both these cases, the incomplete algorithms
can be augmented by a process which alters the input expressions to cause the unification
algorithm applied to the altered expressions to return additional unifiers. The addition of this
process (Slagle's widening operation for the first order predicate calculus [8] and Stickel's
variable splitting operation for expressions of QA4 and QLISP [9]) results in completeness.
Widening and variable splitting are both operations that must be performed on one or both
input expressions an arbitrary number of times, replacing single variables of the expressions
uniformly by two variables; each is essentially (repeated) paramodulation by the functionally
reflexive axiom.

An example of the latter approach is the unification of f(abz) and f(xy) where f is
associative and commutative. The special purpose unification algorithm would return the
unifiers {x←a, y←f(bz)}, {x←b, y←f(az)}, {x←z, y←f(ab)}, {x←f(bz), y←a}, {x←f(az), y←b}, and
{x←f(ab), y←z}. But this is an incomplete set of unifiers since the possibility that the value of
z is not wholly contained in either the value of x or the value of y is not represented. After
performing a widening operation on f(abz) resulting in f(abz_1 z_2) by instantiating z by f(z_1 z_2),
additional new unifiers such as {x←f(az_1), y←f(bz_2), z←f(z_1 z_2)} and {x←f(abz_1), y←z_2,
z←f(z_1 z_2)} are returned by the unification algorithm.

Related to this approach, though different in detail, is Plotkin's work on the theory of
building in equational theories [3], of which associativity and commutativity are examples. In
the case of associativity, Plotkin retains terms in a normal form: right associative form,
although it could equivalently have been our unparenthesized form. His equivalent of the
widening rule, the replacement of a variable by two new variables, is applied continually
inside the unification algorithm rather than being used outside it. Thus his unification algorithm
may generate an infinite number of unifiers, as opposed to a unification algorithm guaranteed
to produce a finite number of unifiers and a potentially infinite process (widening) for altering
the input to the unification algorithm to obtain additional unifiers. The difference in the two
approaches seems to be principally one of organization of the search process.

In this paper, we present a new special purpose unification algorithm, which we call the
AC unification algorithm, for terms whose functions are associative and commutative. It
returns a complete set of unifiers. This algorithm eliminates the need for axiomatizing
associativity and commutativity and also eliminates the cost of continually applying these
axioms, which often results in much unnecessary or redundant computation. It also eliminates
the need for using the process of widening or variable splitting, whose necessity (for
discovering a complete set of unifiers in the case of unifying any particular pair of
expressions) is difficult to ascertain.


Terminology


Definition. A term is defined to be

(1) a constant,

(2) a variable, or

(3) a function symbol succeeded by a list of terms (the arguments of the function).

We shall use the symbols a, b, and c to represent constants, x, y, and z (possibly
indexed) to represent variables, and f to represent a function which is associative and
commutative.

Definition. A substitution component is an ordered pair of a variable v and a term t
written as v←t. A substitution component denotes the assignment of the term to the variable
or the replacement of the variable by the term.

Definition. A substitution is a set of substitution components with distinct first elements,
i.e., distinct variables being substituted for. Applying a substitution to an expression results in
the replacement of those variables of the expression included among the first elements of the
substitution components by the corresponding terms. The substitution components are applied
to the expression in parallel, and no variable occurrence in the second element of a
substitution component will be replaced even if the variable occurs as the first element in
another substitution component. Substitutions will be represented by the symbols σ and θ.
The application of substitution θ to expression A is denoted by Aθ. The composition of
substitutions σθ denotes the substitution whose effect is the same as first applying substitution
σ, then applying substitution θ, i.e., A(σθ) = (Aσ)θ for every expression A.
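Parallel application and composition can be illustrated with a short Python sketch. The
representation is an assumption made only for this example: terms are nested tuples
(function symbol, argument, ...), variables and constants are strings, and a substitution is a
dictionary from variables to terms. The names apply and compose, and the symbols g and h, are
hypothetical.

```python
def apply(term, subst):
    """Apply a substitution in parallel: terms substituted in are not rewritten again."""
    if isinstance(term, tuple):                  # (function symbol, arg, ...)
        return (term[0],) + tuple(apply(a, subst) for a in term[1:])
    return subst.get(term, term)                 # variable or constant

def compose(sigma, theta):
    """Substitution with the same effect as applying sigma first, then theta."""
    result = {v: apply(t, theta) for v, t in sigma.items()}
    for v, t in theta.items():
        result.setdefault(v, t)                  # theta acts on variables sigma leaves alone
    return result

A = ("g", "x", "y")
sigma = {"x": ("h", "y")}
theta = {"y": "a"}

# A(sigma theta) = (A sigma) theta
print(apply(A, compose(sigma, theta)) == apply(apply(A, sigma), theta))

# Parallel application: the y introduced for x is not replaced by a.
print(apply(A, {"x": "y", "y": "a"}))
```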

Definition. A unifying substitution or unifier of two expressions is a substitution which
when applied to the two expressions results in equivalent expressions. In ordinary unification,
two expressions are equivalent if and only if they are identical. In unification of argument lists
of commutative functions, two expressions are equivalent if they have the same function
symbol and the same arguments in the same or different order.

Definition. Term s is an instance of term t, and t is a generalization of s, if there exists
a substitution θ such that tθ = s.

Similarly, substitution σ is an instance (generalization) of substitution θ if, for every
term t, tσ is an instance (generalization) of tθ.

The AC Unification Algorithm

We present here an algorithm for unifying two terms whose function is associative and
commutative. Terms will be represented as if the function had an arbitrary number of
arguments with no superfluous parentheses.

We will assume that the argument lists of the two terms being unified have no common
arguments. This presents no difficulty since no unifiers are lost and efficiency is gained if
common arguments are eliminated immediately. This is done by removing common arguments
a pair at a time, one from each of the argument lists. For example, before unifying f(xxyabc)
and f(bbbcz), the b's common to the two terms are removed yielding f(xxyac) and f(bbcz), and
the c's common to the two new terms are removed yielding f(xxya) and f(bbz). An example of
the utility of immediately removing common arguments is the unification of f(g(x)y) and
f(g(x)g(a)). If the g(x)'s common to the two terms are immediately removed, the unification
algorithm will return the most general unifier {y←g(a)}. If the common g(x)'s are retained,
unification will likely result in the generation of the additional less general unifier
{x←a, y←g(a)}.
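Removing common arguments a pair at a time amounts to a multiset difference. The following
Python sketch illustrates this under the assumption that argument lists are given as lists of
hashable terms; remove_common_arguments is a hypothetical helper name.

```python
from collections import Counter

def remove_common_arguments(args1, args2):
    """Cancel arguments common to both lists, one pair at a time."""
    c1, c2 = Counter(args1), Counter(args2)
    # Counter subtraction keeps only positive counts, i.e. the unmatched arguments.
    return list((c1 - c2).elements()), list((c2 - c1).elements())

# f(xxyabc) and f(bbbcz) from the example above.
left, right = remove_common_arguments(list("xxyabc"), list("bbbcz"))
print(left, right)
```

The result corresponds to f(xxya) and f(bbz); the order of the remaining arguments is
immaterial because f is commutative.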

The algorithm will be expressed partially in terms of an algorithm for the complete
unification of terms with an associative and commutative function with only variables as
arguments. The result of unifying such terms is an assignment to each variable of the terms of
some sequence of terms. Each variable is assigned a term t_i (whose function symbol is not f)
or a term f(t_1^{n_1}...t_m^{n_m}) (with n_i occurrences of term t_i as arguments of f). For such an
assignment to be a unifier, the only requirement is that for each term t_i used in any
assignment there are the same number of occurrences of that term occurring as arguments of
f in each of the unified terms instantiated by the assignment. For example, in unifying
f(x_1 x_1 x_2 x_3) and f(y_1 y_1 y_2), if term t is part of some assignment to one of the variables, then
2 times the number of occurrences of t in the assignment for x_1 plus the number of
occurrences of t in the assignment for x_2 plus the number of occurrences of t in the
assignment for x_3 must equal 2 times the number of occurrences of t in the assignment for y_1
plus the number of occurrences of t in the assignment for y_2. For example, {x_1←f(bb),
x_2←f(ab), x_3←a, y_1←b, y_2←f(aabbb)} is a unifier of f(x_1 x_1 x_2 x_3) and f(y_1 y_1 y_2) since there are
2 a's and 5 b's in the instantiations of f(x_1 x_1 x_2 x_3) and f(y_1 y_1 y_2), the unified term being
f(aabbbbb).


With each pair of terms with an associative and commutative function with only variable
arguments is associated a single equation representing the number and multiplicity of variables
in each term. For example, the equation 2x_1 + x_2 + x_3 = 2y_1 + y_2 is associated with the pair of
terms given above. This equation succinctly represents the condition for a substitution to be a
unifier: that the sum of the number of occurrences of any term in the value of each variable
multiplied by the multiplicity of the variable in the term must be equal for the two terms.
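The counting condition can be checked mechanically. The sketch below, in Python, derives the
coefficients of the equation from the two variable-only argument lists and counts term
occurrences under the example assignment. The representation (each variable's value given as
a list of the terms it contributes as arguments of f) and the function names are assumptions
made for illustration.

```python
from collections import Counter

def equation_coefficients(var_args):
    """Multiplicity of each distinct variable, in order of first appearance."""
    return dict(Counter(var_args))

def occurrences(var_args, assignment):
    """Occurrences of each term among the arguments of the instantiated term."""
    total = Counter()
    for v in var_args:
        total.update(assignment[v])      # assignment[v] lists the terms given to v
    return total

lhs = ["x1", "x1", "x2", "x3"]           # f(x_1 x_1 x_2 x_3)
rhs = ["y1", "y1", "y2"]                 # f(y_1 y_1 y_2)
assignment = {
    "x1": ["b", "b"], "x2": ["a", "b"], "x3": ["a"],
    "y1": ["b"], "y2": ["a", "a", "b", "b", "b"],
}

print(equation_coefficients(lhs), equation_coefficients(rhs))
print(occurrences(lhs, assignment) == occurrences(rhs, assignment))
```

Both sides contain 2 a's and 5 b's, in agreement with the unified term f(aabbbbb).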

Non-negative integral solutions to such equations can be used to represent unifiers.
The solutions must be non-negative integral since each variable must be assigned a
non-negative integral number of occurrences of each term.

In order to generate all the solutions to the problem of unifying the two terms, it is
necessary to be able to represent all the solutions to the equation derived from the terms.
Every non-negative integral solution to the equation is representable as a sum (equivalently, a
sum with non-negative integral weights) of elements of a particular finite set of non-negative
integral solutions to the equation. The finite set of non-negative integral solutions by whose
addition the entire non-negative integral solution space is spanned is generable by generating
solutions to the equation in ascending order of value, eliminating those solutions composable
from those previously generated. This process can be made finite by placing a bound on the
maximum solution value which will be used; such a maximum is proved in a later lemma to
eliminate no needed solutions.
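The generation process can be sketched in Python for the equation 2x_1 + x_2 + x_3 = 2y_1 + y_2
introduced above. This is a brute-force illustration rather than an efficient procedure. It
assumes the bound stated in the later lemma (the maximum of m and n times the maximum least
common multiple of a left and a right coefficient), and it tests composability by checking
whether a previously kept solution fits componentwise inside the candidate, which suffices
when candidates are visited in ascending order of value. The name basis_solutions is
hypothetical; math.lcm requires Python 3.9 or later.

```python
from itertools import product
from math import lcm

def value(coeffs, xs):
    return sum(c * x for c, x in zip(coeffs, xs))

def basis_solutions(a, b):
    """Spanning set of non-negative solutions of sum(a_i x_i) = sum(b_j y_j)."""
    m = len(a)
    bound = max(len(a), len(b)) * max(lcm(ai, bj) for ai in a for bj in b)
    ranges = [range(bound // c + 1) for c in a + b]
    candidates = [s for s in product(*ranges)
                  if any(s)
                  and value(a, s[:m]) == value(b, s[m:])
                  and value(a, s[:m]) <= bound]
    # Ascending order of solution value, ties broken lexicographically.
    candidates.sort(key=lambda s: (value(a, s[:m]), s))
    basis = []
    for s in candidates:
        # s is composable if some kept solution can be subtracted from it.
        if not any(all(k <= c for k, c in zip(kept, s)) for kept in basis):
            basis.append(s)
    return basis

for solution in basis_solutions([2, 1, 1], [2, 1]):
    print(solution)                      # tuples (x1, x2, x3, y1, y2)
```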
Consider the equation 2x_1 + x_2 + x_3 = 2y_1 + y_2. Solutions to the equation are:

      x_1  x_2  x_3  y_1  y_2   2x_1+x_2+x_3   2y_1+y_2   new variable
  1    0    0    1    0    1         1            1          z_1
  2    0    1    0    0    1         1            1          z_2
  3    0    0    2    1    0         2            2          z_3
  4    0    1    1    1    0         2            2          z_4
  5    0    2    0    1    0         2            2          z_5
  6    1    0    0    0    2         2            2          z_6
  7    1    0    0    1    0         2            2          z_7

Associated with each solution above is a new variable (in the rightmost column). The
assignment of as many occurrences of that variable as specified in the solution to each of the
variables of the original term results in a partial solution to the unification of the original
terms. In particular, the assignment of 2 occurrences of variable z_3 to x_3 and 1 occurrence to
y_1 results in an equal number of occurrences of variable z_3 in each of f(x_1 x_1 x_2 x_3) and
f(y_1 y_1 y_2).

Every non-negative integral solution to the equation is a (non-negative integer
weighted) sum of the 7 solutions presented above, i.e., every solution is representable as

x_1 = z_6 + z_7,  x_2 = z_2 + z_4 + 2z_5,  x_3 = z_1 + 2z_3 + z_4,
y_1 = z_3 + z_4 + z_5 + z_7,  y_2 = z_1 + z_2 + 2z_6

for some non-negative integral values of z_1,...,z_7. However, not every solution to the equation
is a solution to the unification problem for which the equation was derived. There is an
additional constraint that each variable of the original terms must have at least one term in its
value; it cannot have zero terms in its value.

Hence, we must form that subset of the 2^7 = 128 sums for which each element of the
5-tuple is non-zero. (It is not necessary to consider sums in which any solution has a
coefficient other than 0 or 1, since such solutions (in the unification problem) are already
representable: the solution's inclusion with coefficient 1 introduces a variable which can
have as its value an arbitrary number of terms as arguments of f, thus simulating the case of
the coefficient being greater than 1.) There are 69 such sums including, for example
(representing the sum by the set of its indices), {2,3,6}, {1,2,3,6}, and {4,6} with associated
unifiers

{x_1←z_6, x_2←z_2, x_3←f(z_3 z_3), y_1←z_3, y_2←f(z_2 z_6 z_6)},

{x_1←z_6, x_2←z_2, x_3←f(z_1 z_3 z_3), y_1←z_3, y_2←f(z_1 z_2 z_6 z_6)}, and

{x_1←z_6, x_2←z_4, x_3←z_4, y_1←z_4, y_2←f(z_6 z_6)}.
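The count of 69 and the example unifiers can be reproduced with a short Python sketch. It
assumes the seven spanning solutions are given as tuples (x_1, x_2, x_3, y_1, y_2) indexed 1 to 7
as in the table, and represents a value f(...) as a tuple headed by "f"; usable_sums and
unifier_for are hypothetical names.

```python
from itertools import combinations

# The seven spanning solutions (x1, x2, x3, y1, y2), indexed as in the table.
SOLUTIONS = {1: (0, 0, 1, 0, 1), 2: (0, 1, 0, 0, 1), 3: (0, 0, 2, 1, 0),
             4: (0, 1, 1, 1, 0), 5: (0, 2, 0, 1, 0), 6: (1, 0, 0, 0, 2),
             7: (1, 0, 0, 1, 0)}
VARIABLES = ("x1", "x2", "x3", "y1", "y2")

def usable_sums(solutions):
    """Index sets whose componentwise sum has no zero component."""
    keep = []
    for r in range(1, len(solutions) + 1):
        for subset in combinations(sorted(solutions), r):
            total = [sum(col) for col in zip(*(solutions[i] for i in subset))]
            if all(total):
                keep.append(subset)
    return keep

def unifier_for(subset, solutions):
    """Give each original variable the new variables z_i, with multiplicity."""
    unifier = {}
    for pos, var in enumerate(VARIABLES):
        args = [f"z{i}" for i in subset for _ in range(solutions[i][pos])]
        unifier[var] = args[0] if len(args) == 1 else ("f", *args)
    return unifier

print(len(usable_sums(SOLUTIONS)))
print(unifier_for((2, 3, 6), SOLUTIONS))
```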

Note that if a variable could have as its value zero terms rather than one or more
terms as in the first order predicate calculus, it would be unnecessary to form this subset of
2^n (where n is the number of solutions) sums. Only the sum of all the solutions would be
required since any variable present in this sum could have value zero, and the variables in the
corresponding unifier could be matched against zero terms. This is the situation with fragment
variables in the bag data type in QA4 and QLISP [7,4] (see [9]).

To be more precise in the definition of the algorithm, the algorithm consists of the
following steps:

1. Form an equation from the two terms where the coefficient of each variable in the equation
is equal to the multiplicity of the corresponding variable in the term.

2. Generate all non-negative integral solutions to the equation, eliminating all those solutions
composable from other solutions.

3. Associate with each solution a new variable.

4. For each sum of the solutions (no solution occurring in the sum more than once) with no
zero components, assemble a unifier composed of assignments to the original variables with as
many of each new variable as specified by the solution element in the sum associated with the
new variable and the original variable.

Now we present the complete algorithm for unifying general terms with associative and
commutative functions using the algorithm for the variable only case above. We are here
concerned with terms whose function is associative and commutative with arbitrary arguments,
i.e., arguments that may contain ordinary (non-associative, non-commutative) functions or f or
other functions which are associative and commutative. We assume the presence of ordinary
unification to deal with those aspects of the unification problem not dealt with explicitly here.

First, when unifying two terms, two new terms with only variable arguments are formed
by uniformly replacing distinct arguments by new variables. These new terms have only
variable arguments and are generalizations of the original two terms. For example, in unifying
f(xxya) and f(bbz), we form generalizations f(x_1 x_1 x_2 x_3) and f(y_1 y_1 y_2) with substitution
{x_1←x, x_2←y, x_3←a, y_1←b, y_2←z} instantiating the new terms to the original terms.
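Forming a generalization is a simple renaming. The Python sketch below assumes an argument
list of hashable terms and a caller-supplied prefix for the new variable names; generalize is
a hypothetical helper, and the sketch does not check that the new names are distinct from
variables already in use.

```python
def generalize(args, prefix):
    """Uniformly replace each distinct argument by a new variable."""
    names = {}                                   # original argument -> new variable
    for arg in args:
        if arg not in names:
            names[arg] = f"{prefix}{len(names) + 1}"
    new_args = [names[arg] for arg in args]
    # Substitution instantiating the generalization back to the original term.
    substitution = {new: old for old, new in names.items()}
    return new_args, substitution

print(generalize(list("xxya"), "x"))             # f(xxya)
print(generalize(list("bbz"), "y"))              # f(bbz)
```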

Next, using the previous algorithm for the variable only case, we unify the
generalizations of the original terms. This has already been done for the example above,
resulting in 69 unifiers as stated previously.

Now we have the generalizations of the two original terms, a substitution to instantiate
them to the original terms, and a complete set of their unifiers. Every unifier of the original
terms is a simultaneous instance of the substitution to instantiate the generalizations to the
original terms and a unifier of the generalizations. So all that is necessary to get all the
unifiers of the original terms is to unify (for each variable being substituted for) the value in
the substitution and the value in the unifiers.

In the example, x_3 must have value a and y_1 must have value b. Thus, any unifier of
f(x_1 x_1 x_2 x_3) and f(y_1 y_1 y_2) which assigns to x_3 or y_1 a non-variable, i.e., a term of the form
f(...), may be immediately excluded from consideration since the unification of it with the
assignment including x_3←a and y_1←b will fail. (This constraint could be applied during the
generation of sums of solutions to the equation rather than afterwards.) This constraint
eliminates 63 of the 69 unifiers, leaving sums (1) {4,6}, (2) {2,4,6}, (3) {1,5,6}, (4) {1,2,5,6},
(5) {1,2,7}, and (6) {1,2,6,7} with associated unifiers

(1) {x_1←z_6, x_2←z_4, x_3←z_4, y_1←z_4, y_2←f(z_6 z_6)},

(2) {x_1←z_6, x_2←f(z_2 z_4), x_3←z_4, y_1←z_4, y_2←f(z_2 z_6 z_6)},

(3) {x_1←z_6, x_2←f(z_5 z_5), x_3←z_1, y_1←z_5, y_2←f(z_1 z_6 z_6)},

(4) {x_1←z_6, x_2←f(z_2 z_5 z_5), x_3←z_1, y_1←z_5, y_2←f(z_1 z_2 z_6 z_6)},

(5) {x_1←z_7, x_2←z_2, x_3←z_1, y_1←z_7, y_2←f(z_1 z_2)}, and

(6) {x_1←f(z_6 z_7), x_2←z_2, x_3←z_1, y_1←z_7, y_2←f(z_1 z_2 z_6 z_6)}.

Unifying each of these with {x_1←x, x_2←y, x_3←a, y_1←b, y_2←z}, we obtain

(1) no unifier since z_4←a and z_4←b are not unifiable,

(2) no unifier since z_4←a and z_4←b are not unifiable,

(3) {x←z_6, y←f(bb), z←f(az_6 z_6)} (= {y←f(bb), z←f(axx)}),

(4) {x←z_6, y←f(bbz_2), z←f(az_2 z_6 z_6)} (= {y←f(bbz_2), z←f(az_2 xx)}),

(5) {x←b, y←z_2, z←f(az_2)} (= {x←b, z←f(ay)}), and

(6) {x←f(bz_6), y←z_2, z←f(az_2 z_6 z_6)} (= {x←f(bz_6), z←f(ayz_6 z_6)}).

This is a complete set of unifiers of f(xxya) and f(bbz).

Since x_3 and y_1 of the variable only case correspond to a and b respectively, and a and
b are not unifiable, any sum including solution 4 to the equation 2x_1 + x_2 + x_3 = 2y_1 + y_2 can be
excluded from consideration since it would require (as in (1) and (2) above) the unification of
a and b. As with the constraint on variables corresponding to non-variable terms not being
assigned more than one variable (terms of the form f(...)) in the variable only case, this latter
constraint on solutions can be applied during the generation of unifiers in the variable only
case rather than afterwards. Elimination of solution 4 before generation of the 2^n sums, and
elimination of sums which do not meet the first constraint, would result in the formation only of
unifiers (3), (4), (5), and (6) of the variable only case, each of which has a corresponding
unifier in the general case.
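Both constraints can be checked against the example with a Python sketch. It hard-codes the
seven spanning solutions as tuples (x_1, x_2, x_3, y_1, y_2) and the fact, specific to this example,
that x_3 and y_1 stand for the non-unifiable constants a and b. The function names are
hypothetical.

```python
from itertools import combinations

# Spanning solutions of 2*x1 + x2 + x3 = 2*y1 + y2, indexed as in the text.
SOLUTIONS = {1: (0, 0, 1, 0, 1), 2: (0, 1, 0, 0, 1), 3: (0, 0, 2, 1, 0),
             4: (0, 1, 1, 1, 0), 5: (0, 2, 0, 1, 0), 6: (1, 0, 0, 0, 2),
             7: (1, 0, 0, 1, 0)}
X3, Y1 = 2, 3    # tuple positions of x3 and y1, which stand for constants a and b

def component(subset, pos):
    return sum(SOLUTIONS[i][pos] for i in subset)

def nonzero_sums():
    """All sums of distinct solutions with no zero component."""
    return [s for r in range(1, 8) for s in combinations(range(1, 8), r)
            if all(component(s, p) for p in range(5))]

def one_term_only(subset):
    """x3 and y1 must each receive exactly one new variable, not a term f(...)."""
    return component(subset, X3) == 1 and component(subset, Y1) == 1

def no_clash(subset):
    """No new variable may be shared by x3 (value a) and y1 (value b)."""
    return not any(SOLUTIONS[i][X3] and SOLUTIONS[i][Y1] for i in subset)

all_sums = nonzero_sums()
first = [s for s in all_sums if one_term_only(s)]
second = [s for s in first if no_clash(s)]
print(len(all_sums), first, second)
```

The first filter leaves the six sums listed as (1) to (6); the second removes the two sums
containing solution 4.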

More precisely, the algorithm consists of the following steps:

1. Form generalizations of the two terms, replacing each distinct argument by a new variable.

2. Use the algorithm for the variable only case to generate unifiers for the generalizations of
the two terms. The variable only case algorithm may be constrained to eliminate the
generation of unifiers assigning more than one term to variables whose value must be a single
term, and the generation of unifiers which will require the later unification of terms which are
obviously not unifiable.

3. Unify, for each variable in the substitution from step 1 and the unifiers from step 2, the
variable values and return the resulting assignments for variables of the original terms. This
is a complete set of unifiers of the original terms.


Proof of Termination, Soundness, and Completeness
of the AC Unification Algorithm

We will first establish the validity of eliminating arguments common to the two terms.
This will be done by proving that any unifier of the terms is a unifier of the terms with a pair
of common arguments removed and vice versa.

Theorem. Let s_1,...,s_m,t_1,...,t_n be terms with s_i = t_j for some i,j. Let θ be a unifier of
f(s_1...s_m) and f(t_1...t_n), and let σ be a unifier of f(s_1...s_{i-1} s_{i+1}...s_m) and
f(t_1...t_{j-1} t_{j+1}...t_n). Then (1) θ is a unifier of f(s_1...s_{i-1} s_{i+1}...s_m) and
f(t_1...t_{j-1} t_{j+1}...t_n), and (2) σ is a unifier of f(s_1...s_m) and f(t_1...t_n).

Proof.

1. f(s_iθ f(s_1...s_{i-1} s_{i+1}...s_m)θ) = f(s_1...s_m)θ = f(t_1...t_n)θ = f(t_jθ f(t_1...t_{j-1} t_{j+1}...t_n)θ),
and s_iθ = t_jθ. Therefore f(s_1...s_{i-1} s_{i+1}...s_m)θ = f(t_1...t_{j-1} t_{j+1}...t_n)θ, and θ is a
unifier of f(s_1...s_{i-1} s_{i+1}...s_m) and f(t_1...t_{j-1} t_{j+1}...t_n).

2. f(s_1...s_{i-1} s_{i+1}...s_m)σ = f(t_1...t_{j-1} t_{j+1}...t_n)σ and s_iσ = t_jσ. Therefore
f(s_iσ f(s_1...s_{i-1} s_{i+1}...s_m)σ) = f(s_1...s_m)σ = f(t_1...t_n)σ = f(t_jσ f(t_1...t_{j-1} t_{j+1}...t_n)σ),
and σ is a unifier of f(s_1...s_m) and f(t_1...t_n). QED.


The lemma below establishes that every non-negative integral solution to an equation of
the form a_1x_1 + ... + a_mx_m = b_1y_1 + ... + b_ny_n is composable as a (non-negative integral weighted)
sum of a fixed finite set of non-negative integral solutions. It also establishes a solution value
within which all the non-negative integral solutions in the set may be found.

Lemma. Every non-negative integral solution (x_1,...,x_m,y_1,...,y_n) to the equation
a_1x_1 + ... + a_mx_m = b_1y_1 + ... + b_ny_n with positive integral coefficients a_1,...,a_m,b_1,...,b_n is an
additive linear combination of non-negative integral solutions with value
(a_1x_1 + ... + a_mx_m = b_1y_1 + ... + b_ny_n) less than or equal to the maximum of m and n times the
maximum of the least common multiples of pairs of numbers, one from a_1,...,a_m and one from
b_1,...,b_n.

Proof. Assume with no loss of generality that the least common multiple (lcm) of a_1 and
b_1 is the maximum of the least common multiples and that m≥n.

Proof by induction on the value of a solution k.

k=0. The solution with k=0 with x_1=0,...,x_m=0, y_1=0,...,y_n=0 is generable as the
additive linear combination of non-negative integral solutions with value less than or equal to
m*lcm(a_1,b_1) with zero coefficients.

Assume the lemma is true for every non-negative integral solution with value less than
k. Prove it is true for k.

Case 1. k ≤ m*lcm(a_1,b_1). In this case, the solution is included among the non-negative
integral solutions with value less than or equal to m*lcm(a_1,b_1) and the lemma is true.

Case 2. k > m*lcm(a_1,b_1). Since a_1x_1 + ... + a_mx_m = k > m*lcm(a_1,b_1), and each a_ix_i≥0, at
least one a_ix_i must be greater than lcm(a_1,b_1), and x_i must be greater than lcm(a_1,b_1)/a_i.
Similarly, since b_1y_1 + ... + b_ny_n = k > m*lcm(a_1,b_1), and each b_jy_j≥0, and n≤m, at least one b_jy_j
must be greater than lcm(a_1,b_1), and y_j must be greater than lcm(a_1,b_1)/b_j. Consider the
solution with x_i=lcm(a_i,b_j)/a_i, y_j=lcm(a_i,b_j)/b_j, and all other variables zero. This is just the
solution in lowest terms involving only x_i and y_j and has value lcm(a_i,b_j) ≤ lcm(a_1,b_1). Since
lcm(a_1,b_1)/a_i ≥ lcm(a_i,b_j)/a_i and lcm(a_1,b_1)/b_j ≥ lcm(a_i,b_j)/b_j by the maximality of lcm(a_1,b_1),
the solution involving only x_i and y_j can be subtracted from the solution with value k leaving a
non-negative integral solution as result. But this difference solution has value k-lcm(a_i,b_j) < k
and is thus composable from solutions with value less than or equal to m*lcm(a_1,b_1).
Therefore, the solution with value k > m*lcm(a_1,b_1) is the sum of some solution involving only
x_i and y_j with value less than or equal to lcm(a_1,b_1) and some other set of solutions with
value less than or equal to m*lcm(a_1,b_1), and the lemma is true for this case. QED.

The lemma proves an upper bound on solution values that must be examined in the
determination of a complete set of non-negative integral solutions which span the
non-negative integral solution space by addition. We believe that tighter bounds can be
proved. Although a proof for a tighter bound would be desirable, it should be noted that a
lower proven bound would not reduce the number of found solutions theoretically necessary,
but would only decrease the cost of computing them, and would have no effect on the form or
number of unifiers returned by the algorithm. This is true since any additional solutions
discovered using a higher bound than necessary must be composable from solutions bounded
by any proven lower bound and would therefore be recognized as redundant and be omitted.

The maximum of the least common multiples of the coefficients, one from the left side
and one from the right side of the equation, is a lower bound on solution values which must be
examined, i.e., solutions with at least this value must be examined. This is because one of the
needed solutions not otherwise generable is the solution involving only the variables with
those two coefficients with maximum least common multiple and having value equal to the
maximum least common multiple.
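The two bounds are easy to compute. The Python sketch below returns the lower bound (the
maximum least common multiple) and the upper bound of the lemma for given coefficient lists,
together with the solution involving only the two variables whose coefficients attain the
maximum. The function names are hypothetical, and the second set of coefficients is an
illustrative example that does not come from the text. math.lcm requires Python 3.9 or later.

```python
from math import lcm

def solution_value_bounds(a, b):
    """Return (lower bound, upper bound) on the solution values to examine."""
    max_lcm = max(lcm(ai, bj) for ai in a for bj in b)
    return max_lcm, max(len(a), len(b)) * max_lcm

def witness(a, b):
    """Solution using only the two variables whose coefficients attain the maximal lcm."""
    i, j = max(((i, j) for i in range(len(a)) for j in range(len(b))),
               key=lambda p: lcm(a[p[0]], b[p[1]]))
    m = lcm(a[i], b[j])
    xs, ys = [0] * len(a), [0] * len(b)
    xs[i], ys[j] = m // a[i], m // b[j]
    return xs, ys

print(solution_value_bounds([2, 1, 1], [2, 1]))   # equation 2x_1 + x_2 + x_3 = 2y_1 + y_2
print(solution_value_bounds([2, 3], [4]))         # illustrative: 2x_1 + 3x_2 = 4y_1
print(witness([2, 3], [4]))
```

For the illustrative equation, the solution x_2 = 4, y_1 = 3 has value 12 and cannot be composed
from solutions of smaller value, so solutions of value at least 12 must be examined.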

Theorem. The AC unification algorithm for terms with associative and commutative
function with only variables as arguments always terminates, is sound (returns no substitutions
which are not unifiers), and is complete (every unifier is an instance of a returned unifier).

Proof  The  algorithm  is  guaranteed  to  terminate  since  it  performs  a finite  number  of 
operations  on  the  finite  number  of  non-negative  integral  solutions  generated  from  the  equation 
corresponding  to  the  two  terms.  The  generation  of  these  solutions  is  finite  due  to  the  trial 
solution  values  being  bounded. 

The algorithm is sound since each solution of the derived equation causes the
introduction into each of the instantiated terms of an equal number of new variable
occurrences. Thus, the two instantiated terms have the same number of occurrences of each
new variable and are therefore unified.

Any unifier must assign to each variable a term of the form $t_i$ (whose function symbol is
not $f$) or a term $f(t_1^{n_1} \ldots t_m^{n_m})$ (with $n_i$ occurrences of term $t_i$ as arguments of $f$). Let $k$ be the
cardinality of the set of such terms $t_i$ in any solution to the unification of a pair of terms with
only variables as arguments. The two instantiated terms must have an equal number of
occurrences of each of these $k$ terms as arguments of $f$. That is,

$a_1 c_{i1} + \cdots + a_m c_{im} = b_1 d_{i1} + \cdots + b_n d_{in}$

where $m$ is the number of distinct variables in the first term being unified, $n$ is the
number of distinct variables in the second term, $a_j$ is the multiplicity of the $j$th variable
in the first term, $b_j$ is the multiplicity of the $j$th variable in the second term, $c_{ij}$ is
the number of occurrences of term $i$ in variable $j$ in the first term, and $d_{ij}$ is the
number of occurrences of term $i$ in variable $j$ in the second term.

Each tuple $\langle c_{i1}, \ldots, c_{im}, d_{i1}, \ldots, d_{in} \rangle$ is a solution to the equation
$a_1 x_1 + \cdots + a_m x_m = b_1 y_1 + \cdots + b_n y_n$ corresponding to the terms being unified. It can thus (according
to the lemma) be formed as the sum of certain non-negative integral solutions to the equation
weighted by positive integers.
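As a hedged illustration of the decomposition the lemma provides, the sketch below searches for non-negative integer weights that express a given tuple as a weighted sum of a supplied list of solutions. The function `decompose` and the example basis are hypothetical. A weight of zero means the corresponding solution is not among the "certain" solutions used.

```python
def decompose(target, basis):
    """Return weights w with sum(w[k] * basis[k]) == target, or None.
    Assumes every basis tuple is nonzero and non-negative."""
    def search(k, remaining):
        if k == len(basis):
            # Success only if nothing is left to account for.
            return [] if not any(remaining) else None
        w = 0
        current = list(remaining)
        # Try weight 0, 1, 2, ... for basis[k] until a component goes negative.
        while all(c >= 0 for c in current):
            rest = search(k + 1, current)
            if rest is not None:
                return [w] + rest
            w += 1
            current = [c - s for c, s in zip(current, basis[k])]
        return None
    return search(0, list(target))

# Solutions of 2*x1 + x2 = 3*y1, written as (x1, x2, y1).
example_basis = [(0, 3, 1), (1, 1, 1), (3, 0, 2)]
weights = decompose((4, 1, 3), example_basis)
print(weights)
```

Here the tuple (4, 1, 3) satisfies 2·4 + 1 = 3·3, and the search expresses it as the sum of the second and third solutions.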

Consider the unifier corresponding to the sum of all those solutions to the equation
which are required in the formation of any of the tuples $\langle c_{i1}, \ldots, c_{im}, d_{i1}, \ldots, d_{in} \rangle$. We will show
that the hypothesized unifier is an instance of this unifier returned by the algorithm.

Include in the value of the new variable associated with each of these solutions a
number of occurrences of term $i$ equal to the coefficient of the solution in the weighted sum.
This will result in the proper assignment of $c_{ij}$ occurrences of term $i$ to each variable $j$ of the
first term and $d_{ij}$ occurrences of term $i$ to each variable $j$ of the second term.

Do  this  for  each  of  the  k terms  in  the  solution.  Let  no  other  or  additional  terms  be 
included  in  the  values  of  the  new  variables. 

This assignment of terms in the solution to new variables associated with equation
solutions generated in the unification process results in the correct number $c_{ij}$ or $d_{ij}$ of each
term being assigned to each variable of the original two terms.

Thus, any solution to the unification of two terms with only variables as arguments is an
instance of a returned unifier and the algorithm is complete. QED
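The soundness argument in the proof above can be checked on a small case. The sketch below is an illustration under stated assumptions: the two terms share no variables, each chosen solution gets a fresh variable named `z1`, `z2`, ... (hypothetical names), and the value assigned to a variable is modelled as a multiset of those fresh variables.

```python
from collections import Counter

def unifier_from_solutions(left_vars, right_vars, solutions):
    """Assign to each original variable a multiset of new variables:
    solution k contributes solution[k][j] occurrences of z_k to variable j."""
    all_vars = list(left_vars) + list(right_vars)
    subst = {v: Counter() for v in all_vars}
    for k, sol in enumerate(solutions, start=1):
        for v, count in zip(all_vars, sol):
            if count:
                subst[v][f"z{k}"] += count
    return subst

def instantiate(term_args, subst):
    """Multiset of arguments of f after applying subst to f(term_args)."""
    total = Counter()
    for v in term_args:
        total.update(subst[v])
    return total

# f(x1, x1, x2) against f(y1, y1, y1), i.e. 2*x1 + x2 = 3*y1.
sols = [(0, 3, 1), (1, 1, 1), (3, 0, 2)]
subst = unifier_from_solutions(["x1", "x2"], ["y1"], sols)
left = instantiate(["x1", "x1", "x2"], subst)
right = instantiate(["y1", "y1", "y1"], subst)
print(left == right)
```

Each solution contributes the same number of occurrences of its new variable to both sides, which is why the two argument multisets coincide.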

Theorem. The AC unification algorithm for general terms with associative and
commutative function always terminates, is sound, and is complete.

Proof. Let $s$ and $t$ be any two terms being unified. Let $s^*$ and $t^*$ be the terms
resulting from replacing each distinct term by a new variable. $s^*$ and $t^*$ are generalizations
of $s$ and $t$ respectively, i.e., $s^*\theta = s$ and $t^*\theta = t$ for some $\theta$ of the form $\{\ldots, x_i \leftarrow c_i, \ldots\}$ where each $x_i$
is a new variable and each $c_i$ is the term in $s$ or $t$ it replaces in $s^*$ or $t^*$.

Let $\{\sigma_j\}$ denote the unifiers of $s^*$ and $t^*$ returned by the unification algorithm for terms
with associative and commutative function with only variables as arguments. Each $\sigma_j$ is of the
form $\{\ldots, x_i \leftarrow d_i, \ldots\}$ where each $x_i$ is a variable of $s^*$ or $t^*$ and $d_i$ is the term assigned to it by
the unification algorithm. According to the previous theorem, unification terminates, is sound,
and is complete for this case.

Simultaneous instances of $\theta$ and $\sigma_j$ represent unifiers of $s$ and $t$ since $s^*\theta = s$, $t^*\theta = t$, and
$s^*\sigma_j = t^*\sigma_j$.

Unifying each $c_i$ with each $d_i$ of a returned unifier $\sigma_j$ of $s^*$ and $t^*$ results in (by the
assumption of termination, soundness, and completeness of the recursive call on the unification
algorithm for terms of lesser complexity) a complete set of unifiers for the original terms $s$
and $t$. QED
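The generalization step used in this proof can be sketched as follows. This is a minimal illustration, not the paper's implementation: arguments are treated as opaque hashable values (strings for variables and constants, tuples for compound terms), the fresh names `x1`, `x2`, ... are hypothetical, and the function `generalize` is introduced only for this example.

```python
def generalize(s_args, t_args):
    """Replace each distinct argument by a new variable.
    Returns the argument lists of s* and t* and the substitution theta
    (new variable -> original argument) with s*theta = s and t*theta = t."""
    theta = {}   # new variable -> argument it replaces
    names = {}   # argument -> new variable standing for it

    def abstract(args):
        out = []
        for arg in args:
            if arg not in names:
                names[arg] = f"x{len(names) + 1}"
                theta[names[arg]] = arg
            out.append(names[arg])
        return out

    return abstract(s_args), abstract(t_args), theta

def apply_theta(args, theta):
    """Map generalized arguments back to the original arguments."""
    return [theta[a] for a in args]

# s = f(a, x, g(y)) and t = f(g(y), z, z), with g(y) written as ("g", "y").
s_args = ["a", "x", ("g", "y")]
t_args = [("g", "y"), "z", "z"]
s_star, t_star, theta = generalize(s_args, t_args)
print(s_star, t_star, theta)
```

Equal arguments receive the same new variable, so the generalized terms contain only variables and can be passed to the variable-only case; each returned binding for a new variable must then be unified with the argument that variable replaced.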


Conclusion 

We have presented an algorithm for unifying general terms with associative and
commutative function. We have proven that the algorithm is guaranteed to terminate, is sound,
and is complete.

The advantages of this algorithm as compared to other approaches to unifying such
terms are that the associativity and commutativity properties need not be axiomatized and that
all the unifiers of a pair of such terms are immediately returned, eliminating the unnecessary
and redundant computation often occurring in other approaches which generate only some of
the unifiers at each step with no indication of when all the unifiers have been generated.